Santander Department
Long-form factuality in large language models Jerry Wei 1 Chengrun Y ang 1 Xinying Song 1 Yifeng Lu
To benchmark a model's long-form factuality in open domains, we first use GPT -4 to generate LongFact, a prompt set comprising thousands of questions spanning 38 topics. We then propose that LLM agents can be used as automated evaluators for long-form factuality through a method which we call Search-Augmented Factuality Evaluator (SAFE).
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Oceania > Australia > South Australia > Adelaide (0.14)
- (48 more...)
- Research Report > Experimental Study (1.00)
- Personal > Honors (0.67)
- Media > Television (1.00)
- Media > Music (1.00)
- Media > Film (1.00)
- (22 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Oceania > Australia > South Australia > Adelaide (0.14)
- (50 more...)
- Research Report > Experimental Study (1.00)
- Personal > Honors (0.67)
- Overview (0.67)
- Media > Television (1.00)
- Media > Music (1.00)
- Media > Film (1.00)
- (22 more...)
See the past: Time-Reversed Scene Reconstruction from Thermal Traces Using Visual Language Models
Contreras, Kebin, Toscano-Palomino, Luis, Mura, Mauro Dalla, Bacca, Jorge
Recovering the past from present observations is an intriguing challenge with potential applications in forensics and scene analysis. Thermal imaging, operating in the infrared range, provides access to otherwise invisible information. Since humans are typically warmer (37 C -98.6 F) than their surroundings, interactions such as sitting, touching, or leaning leave residual heat traces. These fading imprints serve as passive temporal codes, allowing for the inference of recent events that exceed the capabilities of RGB cameras. This work proposes a time-reversed reconstruction framework that uses paired RGB and thermal images to recover scene states from a few seconds earlier. The proposed approach couples Visual-Language Models (VLMs) with a constrained diffusion process, where one VLM generates scene descriptions and another guides image reconstruction, ensuring semantic and structural consistency. The method is evaluated in three controlled scenarios, demonstrating the feasibility of reconstructing plausible past frames up to 120 seconds earlier, providing a first step toward time-reversed imaging from thermal traces.
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
- South America > Colombia > Santander Department (0.04)
- Asia > Middle East > Iran (0.04)
Federative ischemic stroke segmentation as alternative to overcome domain-shift multi-institution challenges
Rangel, Edgar, Martinez, Fabio
Stroke is the second leading cause of death and the third leading cause of disability worldwide. Clinical guidelines establish diffusion resonance imaging (DWI, ADC) as the standard for localizing, characterizing, and measuring infarct volume, enabling treatment support and prognosis. Nonetheless, such lesion analysis is highly variable due to different patient demographics, scanner vendors, and expert annotations. Computational support approaches have been key to helping with the localization and segmentation of lesions. However, these strategies are dedicated solutions that learn patterns from only one institution, lacking the variability to generalize geometrical lesions shape models. Even worse, many clinical centers lack sufficient labeled samples to adjust these dedicated solutions. This work developed a collaborative framework for segmenting ischemic stroke lesions in DWI sequences by sharing knowledge from deep center-independent representations. From 14 emulated healthcare centers with 2031 studies, the FedAvg model achieved a general DSC of $0.71 \pm 0.24$, AVD of $5.29 \pm 22.74$, ALD of $2.16 \pm 3.60$ and LF1 of $0.70 \pm 0.26$ over all centers, outperforming both the centralized and other federated rules. Interestingly, the model demonstrated strong generalization properties, showing uniform performance across different lesion categories and reliable performance in out-of-distribution centers (with DSC of $0.64 \pm 0.29$ and AVD of $4.44 \pm 8.74$ without any additional training).
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Projection-Based Correction for Enhancing Deep Inverse Networks
Deep learning-based models have demonstrated remarkable success in solving ill-posed inverse problems; however, many fail to strictly adhere to the physical constraints imposed by the measurement process. In this work, we introduce a projection-based correction method to enhance the inference of deep inverse networks by ensuring consistency with the forward model. Specifically, given an initial estimate from a learned reconstruction network, we apply a projection step that constrains the solution to lie within the valid solution space of the inverse problem. We theoretically demonstrate that if the recovery model is a "well-trained deep inverse network", the solution can be decomposed into range-space and null-space components, where the projection-based correction reduces to an identity transformation. Extensive simulations and experiments validate the proposed method, demonstrating improved reconstruction accuracy across diverse inverse problems and deep network architectures.
- South America > Colombia > Santander Department > Bucaramanga (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Health & Medicine > Diagnostic Medicine > Imaging (0.46)
- Education (0.37)
A multitask transformer to sign language translation using motion gesture primitives
López, Fredy Alejandro Mendoza, Rodriguez, Jefferson, Martínez, Fabio
The absence of effective communication the deaf population represents the main social gap in this community. Furthermore, the sign language, main deaf communication tool, is unlettered, i.e., there is no formal written representation. In consequence, main challenge today is the automatic translation among spatiotemporal sign representation and natural text language. Recent approaches are based on encoder-decoder architectures, where the most relevant strategies integrate attention modules to enhance non-linear correspondences, besides, many of these approximations require complex training and architectural schemes to achieve reasonable predictions, because of the absence of intermediate text projections. However, they are still limited by the redundant background information of the video sequences. This work introduces a multitask transformer architecture that includes a gloss learning representation to achieve a more suitable translation. The proposed approach also includes a dense motion representation that enhances gestures and includes kinematic information, a key component in sign language. From this representation it is possible to avoid background information and exploit the geometry of the signs, in addition, it includes spatiotemporal representations that facilitate the alignment between gestures and glosses as an intermediate textual representation. Keywords: Sign language translation, gloss, transformer, deep learning representations 2010 MSC: 00-01, 99-00 1. Introduction Approximately 1 .5 billion people have some associated degree of hearing loss worldwide. These languages are composed of visio-spatial gestural movements and expressions, together with complex manual and non-manual interactions. Today there are more than 150 official SLs with multiple variations in each country. Like any language, there is an intrinsic grammatical richness with multiple gestural and expressive variations. These aspects make the modeling of SLs a very challenging task, even for the most advanced computer vision and representation learning methodologies. In fact, signs do not have a direct written representation, which makes it more difficult to structure the language, implying major challenges to find correspondence with other textual languages.
Development of a Deep Learning Model for the Prediction of Ventilator Weaning
Gonzalez, Hernando, Arizmendi, Carlos Julio, Giraldo, Beatriz F.
The issue of failed weaning is a critical concern in the intensive care unit (ICU) setting. This scenario occurs when a patient experiences difficulty maintaining spontaneous breathing and ensuring a patent airway within the first 48 hours after the withdrawal of mechanical ventilation. Approximately 20 of ICU patients experience this phenomenon, which has severe repercussions on their health. It also has a substantial impact on clinical evolution and mortality, which can increase by 25 to 50. To address this issue, we propose a medical support system that uses a convolutional neural network (CNN) to assess a patients suitability for disconnection from a mechanical ventilator after a spontaneous breathing test (SBT). During SBT, respiratory flow and electrocardiographic activity were recorded and after processed using time-frequency analysis (TFA) techniques. Two CNN architectures were evaluated in this study: one based on ResNet50, with parameters tuned using a Bayesian optimization algorithm, and another CNN designed from scratch, with its structure also adapted using a Bayesian optimization algorithm. The WEANDB database was used to train and evaluate both models. The results showed remarkable performance, with an average accuracy 98 when using CNN from scratch. This model has significant implications for the ICU because it provides a reliable tool to enhance patient care by assisting clinicians in making timely and accurate decisions regarding weaning. This can potentially reduce the adverse outcomes associated with failed weaning events.
- South America > Colombia > Santander Department > Bucaramanga (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- South America > Colombia > Risaralda Department > Pereira (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Diagnosis of Patients with Viral, Bacterial, and Non-Pneumonia Based on Chest X-Ray Images Using Convolutional Neural Networks
Arizmendi, Carlos, Pinto, Jorge, Arboleda, Alejandro, González, Hernando
According to the World Health Organization (WHO), pneumonia is a disease that causes a significant number of deaths each year. In response to this issue, the development of a decision support system for the classification of patients into those without pneumonia and those with viral or bacterial pneumonia is proposed. This is achieved by implementing transfer learning (TL) using pre-trained convolutional neural network (CNN) models on chest x-ray (CXR) images. The system is further enhanced by integrating Relief and Chi-square methods as dimensionality reduction techniques, along with support vector machines (SVM) for classification. The performance of a series of experiments was evaluated to build a model capable of distinguishing between patients without pneumonia and those with viral or bacterial pneumonia. The obtained results include an accuracy of 91.02%, precision of 97.73%, recall of 98.03%, and an F1 Score of 97.88% for discriminating between patients without pneumonia and those with pneumonia. In addition, accuracy of 93.66%, precision of 94.26%, recall of 92.66%, and an F1 Score of 93.45% were achieved for discriminating between patients with viral pneumonia and those with bacterial pneumonia.
- South America > Colombia > Santander Department > Bucaramanga (0.05)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- North America > United States (0.04)
- (2 more...)
Spatio-temporal transformer to support automatic sign language translation
Ruiz, Christian, Martinez, Fabio
Sign Language Translation (SLT) systems support hearing-impaired people communication by finding equivalences between signed and spoken languages. This task is however challenging due to multiple sign variations, complexity in language and inherent richness of expressions. Computational approaches have evidenced capabilities to support SLT. Nonetheless, these approaches remain limited to cover gestures variability and support long sequence translations. This paper introduces a Transformer-based architecture that encodes spatio-temporal motion gestures, preserving both local and long-range spatial information through the use of multiple convolutional and attention mechanisms. The proposed approach was validated on the Colombian Sign Language Translation Dataset (CoL-SLTD) outperforming baseline approaches, and achieving a BLEU4 of 46.84%. Additionally, the proposed approach was validated on the RWTH-PHOENIX-Weather-2014T (PHOENIX14T), achieving a BLEU4 score of 30.77%, demonstrating its robustness and effectiveness in handling real-world variations
- South America > Colombia > Bogotá D.C. > Bogotá (0.04)
- South America > Colombia > Santander Department > Bucaramanga (0.04)
- Europe > France (0.04)
Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise
Monroy, Brayan, Bacca, Jorge, Tachella, Julián
Recorrupted-to-Recorrupted (R2R) has emerged as a methodology for training deep networks for image restoration in a self-supervised manner from noisy measurement data alone, demonstrating equivalence in expectation to the supervised squared loss in the case of Gaussian noise. However, its effectiveness with non-Gaussian noise remains unexplored. In this paper, we propose Generalized R2R (GR2R), extending the R2R framework to handle a broader class of noise distribution as additive noise like log-Rayleigh and address the natural exponential family including Poisson and Gamma noise distributions, which play a key role in many applications including low-photon imaging and synthetic aperture radar. We show that the GR2R loss is an unbiased estimator of the supervised loss and that the popular Stein's unbiased risk estimator can be seen as a special case. A series of experiments with Gaussian, Poisson, and Gamma noise validate GR2R's performance, showing its effectiveness compared to other self-supervised methods.
- South America > Colombia > Santander Department (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)